Automata for Positive Core XPath Queries on Compressed Documents
Abstract
Given any dag t representing a fully or partially compressed XML document, we present a method for evaluating any positive unary query expressed in terms of Core XPath axes, on t, without unfolding t into a tree. To each Core XPath query of a certain basic type, we associate a word automaton; these automata run on the graph of dependency between the non-terminals of the straightline regular tree grammar associated to the given dag, or along complete sibling chains in this grammar. Any given Core XPath query can be decomposed into queries of the basic type, and the answer to the query, on the dag t, can then be expressed as a sub-dag of t suitably labeled under the runs of such automata.
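As a rough illustration of the idea in the abstract, the sketch below runs a small word automaton over the dependency graph of a grammar's non-terminals instead of over the unfolded tree. It is not the construction from the paper: the grammar `GRAMMAR`, the function names `step`, `arriving_states` and `select`, and the choice of the query `//a//b` (nodes labelled `b` with a proper ancestor labelled `a`) are all assumptions made for this example, and sibling axes are not covered.

```python
from collections import deque

# Hypothetical straight-line regular tree grammar (an assumption for this
# sketch): each non-terminal maps to (label, list of child non-terminals).
# "B" is shared: it occurs directly under "S" and twice under "A".
GRAMMAR = {
    "S": ("root", ["A", "B"]),
    "A": ("a", ["B", "B"]),
    "B": ("b", ["C"]),
    "C": ("c", []),
}


def step(state, label):
    """Word automaton for //a//b, read along a root-to-node path.

    State 0: no ancestor labelled 'a' seen so far.
    State 1: some proper ancestor is labelled 'a'.
    """
    return 1 if state == 1 or label == "a" else 0


def arriving_states(grammar, start):
    """For each non-terminal, collect the automaton states in which it can be
    reached from `start`, walking the dependency graph only."""
    reached = {nt: set() for nt in grammar}
    reached[start].add(0)
    queue = deque([(start, 0)])
    while queue:
        nt, state = queue.popleft()
        label, children = grammar[nt]
        nxt = step(state, label)  # state handed down to every child
        for child in children:
            if nxt not in reached[child]:
                reached[child].add(nxt)
                queue.append((child, nxt))
    return reached


def select(grammar, start, target="b"):
    """Return (non-terminal, state) pairs that denote selected nodes."""
    reached = arriving_states(grammar, start)
    return {
        (nt, s)
        for nt, states in reached.items()
        for s in states
        if s == 1 and grammar[nt][0] == target
    }


if __name__ == "__main__":
    print(arriving_states(GRAMMAR, "S"))
    print(select(GRAMMAR, "S"))
```

In this toy grammar the shared non-terminal `B` is reached in both states, so only its occurrences below `A` count as answers. Labelling non-terminals with states, rather than returning bare non-terminals, is what lets the answer stay a labelled sub-dag; the work is bounded by the number of (non-terminal, state) pairs rather than by the size of the unfolded tree.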
Similar resources
Automata for Analyzing and Querying Compressed Documents
In a first part of this work, tree/dag automata are defined as extensions of (unranked) tree automata which can run indifferently on trees or dags; they can thus serve as tools for analyzing or querying any semi-structured document, whether or not given in a compressed format. In a second part of the work, we present a method for evaluating positive unary queries, expressed in terms of Core XPa...
Query Evaluation on Compressed Trees
This paper studies the problem of evaluating unary (or node-selecting) queries on unranked trees compressed in a natural structure-preserving way, by the sharing of common subtrees. The motivation to study unary queries on unranked trees comes from the database field, where querying XML documents, which can be considered as unranked labelled trees, is an important task. We give algorithms and co...
The complexity of tree automata and XPath on grammar-compressed trees
The complexity of various membership problems for tree automata on compressed trees is analyzed. Two compressed representations are considered: dags, which allow to share identical subtrees in a tree, and straight-line context-free tree grammars, which moreover allow to share identical intermediate parts in a tree. Several completeness results for the classes NL, P, and PSPACE are obtained. Fin...
Compression vs Queryability - A Case Study
Some compromise on compression is known to be necessary, if the relative positions of the information stored by semi-structured documents are to remain accessible under queries. With this in view, we compare, on an example, the ‘query-friendliness’ of XML documents, when compressed into straightline tree grammars which are either regular or context-free. The queries considered are in a limited ...
Fast In-Memory XPath Search over Compressed Text and Tree Indexes
A large fraction of an XML document typically consists of text data. The XPath query language allows text search via the equal, contains, and starts-with predicates. Such predicates can efficiently be implemented using a compressed self-index of the document’s text nodes. Most queries, however, contain some parts of querying the text of the document, plus some parts of querying the tree structu...